Beyond Perception: Evaluating Abstract Visual Reasoning through Multi-Stage Task

Y Jiang; Y Ding; C Lei; J Ao; JH Lau; KA Ehinger

Conference Proceedings

Proceedings of the Annual Meeting of the Association for Computational Linguistics | Association for Computational Linguistics | Published : 2025

DOI: 10.18653/v1/2025.findings-acl.2

Open access

Abstract

Current Multimodal Large Language Models (MLLMs) excel at general visual reasoning but remain underexplored in Abstract Visual Reasoning (AVR), which demands higher-order reasoning to identify abstract rules beyond simple perception. Existing AVR benchmarks focus on single-step reasoning, emphasizing the end result but neglecting the multi-stage nature of the reasoning process. Past studies found that MLLMs struggle with these benchmarks, but they do not explain how the models fail. To address this gap, we introduce MultiStAR, a MultiStage AVR benchmark based on RAVEN, to assess reasoning across varying levels of complexity. Additionally, existing metrics like accuracy only focus on the final outcomes while...

University of Melbourne Researchers

Jey Han Lau Author

Kris Ehinger Author

Related Projects (1)

Empowering Next-Generation Spatial Digital Twins with Linked Spatial Data
